%%
%% This is file `mcmthesis-demo.tex',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% mcmthesis.dtx  (with options: `demo')
%%
%% -----------------------------------
%%
%% This is a generated file.
%%
%% Copyright (C)
%%       2010 -- 2015 by Zhaoli Wang
%%       2014 -- 2019 by Liam Huang
%%       2019 -- present by latexstudio.net
%%
%% This work may be distributed and/or modified under the
%% conditions of the LaTeX Project Public License, either version 1.3
%% of this license or (at your option) any later version.
%% The latest version of this license is in
%%   http://www.latex-project.org/lppl.txt
%% and version 1.3 or later is part of all distributions of LaTeX
%% version 2005/12/01 or later.
%%
%% This work has the LPPL maintenance status `maintained'.
%%
%% The Current Maintainer of this work is Liam Huang.
%%
\documentclass{mcmthesis}
    \mcmsetup{CTeX = false,    % set to true when using the CTeX suite
          tcn = {2316191}, problem = \textcolor{red}{D},
          sheet = true, titleinsheet = true, keywordsinsheet = true,
          titlepage = false, abstract = false}
        
\usepackage{newtxtext}     % \usepackage{palatino}
% \usepackage[american]{babel} 
% \usepackage{csquotes} 
% \usepackage[style=mla, backend=bibtex]{biblatex}
% 	\addbibresource{./reference.bib}

\bibliographystyle{unsrt}

\usepackage{tocloft}
\usepackage{hyperref}
\setlength{\cftbeforesecskip}{6pt}
\renewcommand{\contentsname}{\hspace*{\fill}\Large\bfseries Contents \hspace*{\fill}}

\usepackage{algorithm2e}
\SetKwComment{Comment}{/* }{ */}

\usepackage{multirow}
\usepackage{graphicx}
\usepackage{float}
% \usepackage[backend=bibtex]{biblatex}

\newcommand{\Emph}[1]{\textbf{#1}}
\newcommand{\num}[1]{\textit{#1}}

\title{\vspace{-0.4em}Exploring Relationships Between SDGs\vspace{-0.2em}}
% \author{\small \href{http://www.latexstudio.net/}
%   {\includegraphics[width=7cm]{mcmthesis-logo}}}
\date{\today}

\begin{document}

\begin{abstract}
  % \linespread{0.95} \selectfont
  % \setlength{\parskip}{0.2em}
  \vspace{-0.9em}

  The United Nations (UN) has set 17 Sustainable Development Goals (SDGs), providing a shared blueprint for peace and prosperity for people and the planet, now and into the future. These goals are not independent; they influence and constrain each other. Taking the interactions among the SDGs as a starting point, we combine \Emph{statistical analysis}, \Emph{natural language processing}, \Emph{graph theory}, \Emph{operations research} and \Emph{social research} to study these relationships and their consequences for goal prioritization.

\Emph{In Problem 1}, we model the SDG network by analyzing \Emph{statistical data} and \Emph{textual data}. We define the network as a \Emph{weighted undirected graph} in which each SDG is a node and each edge weight lies in $[-1,1]$, ranging from mutual constraint (negative) to mutual facilitation (positive). To obtain reasonable \Emph{statistical data}, we selected one or two indicators for each SDG (27 in total) from the 169 indicators defined by the United Nations for the 17 SDGs. After processing these data, we obtain the correlation between SDGs from a statistical perspective. To make the network more reasonable and interpretable, we also introduce \Emph{textual data}, namely the standard framework established by the United Nations for the SDGs. The similarity between texts is obtained using the Sentence-BERT model.

\Emph{In Problem 2}, we apply the Analytic Hierarchy Process (AHP) to three aspects of each SDG: \Emph{its own importance}, \Emph{structural importance}, and \Emph{difficulty}. The \Emph{importance of the SDG itself} was obtained from a survey of experts; the \Emph{structural importance} was obtained by running the PageRank algorithm on the network built in Problem 1; and the \Emph{difficulty} of completing the SDG was obtained by processing relevant statistics. Using AHP, the three indicators are integrated to define the priority of each SDG. Among them, \emph{GOAL 9: Industry, Innovation and Infrastructure}, \emph{GOAL 7: Affordable and Clean Energy}, and \emph{GOAL 16: Peace, Justice and Strong Institutions} receive the highest priorities. We also project what might be achieved 10 years after setting these priorities.

\Emph{In Problem 3}, following the indicators and evaluation methods set in Problem 2, we discuss how the completion of individual SDGs affects our network structure and priority choices, using the poverty and hunger goals as specific examples. We also suggest other possibilities for setting new goals.

\Emph{In Problem 4}, we discuss the impact of specific factors (e.g., technological advances, global pandemics) on our model. These effects can be expressed in terms of the metrics we originally set, and through the constructed network and evaluation criteria, we can visualize the impact of these factors on priority selection and the UN process.

\Emph{In Problem 5}, we analyze whether our model can be used to set goal priorities for other companies and organizations.

Finally, we analyze the convergence and sensitivity of our model and illustrate the strengths and weaknesses of our model.

  \begin{keywords}
    Sustainable Development Goals; Graph theory; Natural Language Processing; Statistical analysis
  \end{keywords}

\end{abstract}

\maketitle

%% Generate the Table of Contents, if it's needed.
% \renewcommand{\contentsname}{\centering Contents}
\tableofcontents   % comment out this line if no table of contents is wanted
\thispagestyle{empty}

\newpage

\section{Introduction}

\subsection{Background}

In 2014, UN Member States proposed 17 Sustainable Development Goals (SDGs), which succeed the Millennium Development Goals (MDGs) as reference goals for the international development community for the period 2015--2030.
\begin{figure}[H]
    \centering
    \includegraphics[width=0.2\textwidth]{figures/UNSustainableDevelopmentGoals_Brand-02.png}
    \label{fig:intro}
\end{figure}

These goals are not independent: the accomplishment of one goal may have positive or negative effects on others. To capture the linkages among the goals more comprehensively, and to guide states and governments in prioritizing them, it is necessary to build a proper network among the goals and to view these linkages from a network perspective.

% \subsection{Literature Review}

\subsection{Problem Restatement}

\begin{itemize}
  \item {\bf Problem One}. Create a proper network among SDGs that represents the relationships between different SDGs and the importance of each SDG.

  \item {\bf Problem Two}. Based on the network built in Problem One, set the priority of each SDG and explain the evaluation criteria. Then estimate which of these SDGs are reasonable to achieve in the next 10 years after initiating our priorities.

  \item {\bf Problem Three}. If a particular SDG is achieved, the network architecture changes (i.e., the corresponding node is eliminated from the network). Describe how these changes affect the previously set priorities, and determine whether new SDGs need to be added.

  \item {\bf Problem Four}. Explain how external factors (e.g., technological advances, global pandemics) affect the network and the priorities, and describe the impact of these external factors on achieving progress from a network perspective.

  \item {\bf Problem Five}. Describe how to expand the network we built to help other companies and organizations prioritize their goals.

\end{itemize}

\subsection{Our Work}

Our work mainly includes the following steps (figure \ref{fig:our_work}):

\begin{itemize}
  \item We built a relationship network. We first collected relevant data for each goal and used them to compute the correlation coefficients between goals. We then analyzed the text of each goal's indicators with natural language processing methods to obtain the semantic similarity between goals. Finally, we multiplied the two quantities to obtain the relationship strength between goals, which serves as the edge weight of the network.
  \item We used the PageRank algorithm to analyze the network structure and combined the result with the collected data to compute the priority of each node. Based on these priorities and the data, we identified goals that can reasonably be accomplished within 10 years.
  \item We calculated what the network looks like after a goal is completed and how this affects the priorities, and we analyzed which issues are suitable to be added as new goals.
  \item We discuss the impact of emergent phenomena such as technological advances, global pandemics, climate change, regional wars, and refugee movements, or other international crises on the network and target prioritization.
  \item Finally, we discuss where our model can help companies and other organizations prioritize their goals.
\end{itemize}

\begin{figure}[H]
  \centering
  \includegraphics[width=0.78\textwidth]{figures/our_work.png}
  \caption{Flow chart of our work}
  \label{fig:our_work}
\end{figure}


\section{Assumptions and Justification}

To simplify the problem and make it convenient for us to estimate real-world conditions, we make the following basic assumptions, each of which is properly justified.

\begin{itemize}
  \item {\bf Data Validity.} The data used in this report were all obtained from the Internet, mainly from the official websites of the United Nations and other international organizations. We cannot fully verify their validity; however, given the international recognition of these organizations, we assume that the data are true and valid.

  \item {\bf Other Statements.} The conclusions drawn in this report are obtained only from the models we build, which have limitations, and do not represent any individual or collective views. The obtained results will be further discussed.

\end{itemize}

\section{Notations}

\begin{table}[H]
  \centering
  \renewcommand{\arraystretch}{1.5}
  \begin{tabular}{m{0.15\textwidth}<{\raggedright}m{0.6\textwidth}<{\raggedright}}
    $G$           & Graph (equivalent to network)                          \\
    $V$           & Vertices                                               \\
    $E$           & Edges (undirected)                                     \\
    $w$           & Weight of an undirected edge                           \\
    $\mathbf{rs}$ & Correlation coefficient obtained from statistical data \\
    $\mathbf{rt}$ & Similarity coefficient obtained from textual data      \\
    $\mathbf{It}$ & Target Importance Index                                \\
    $\mathbf{Is}$ & Target Structure Importance Index                      \\
    $\mathbf{Id}$ & Target Difficulty Index                                \\
    $\mathbf{p}$  & Priority vector                                        \\
  \end{tabular}
\end{table}

Unless otherwise specified, all bold variables below (e.g., $\mathbf{It}$) are vectors; the same symbol in italics with a subscript (e.g., $It_i$) denotes the $i$-th component of that vector.

\section{Problem One: Building SDG Networks}

In this problem, we build a network to represent the relationships between SDGs. We consider each SDG as a node in the network, and an edge between two nodes represents the mutual influence between the two SDGs. Because the impacts between SDGs may be synergistic or mutually constraining, and their strength varies from pair to pair, we also introduce edge weights to quantify these impacts.

Formally, the network we will build is a \Emph{weighted undirected graph} $G=(V, E)$, where $V=\{ v_1, v_2, \ldots , v_{17}\}$ are the vertices representing the 17 different SDGs, $E=\{ (u, v, w)|u \in V, v \in V,u\neq v,w \in \mathbb{R}\}$ are the edges between these vertices, where $u,v$ are the vertices representing two different SDGs and $w$ quantifies the influence between them. We specify $w\in[-1, 1]$, where $w$ represents a synergistic relationship between these two corresponding SDGs if $w > 0$, and represents a trade-off relationship between these two corresponding SDGs if $w < 0$. A larger value of $|w|$ indicates a greater degree of this influence.

Inspired by previous work~\cite{leblancIntegrationLastSustainable2015}\cite{breuerTranslatingSustainableDevelopment2019}\cite{pradhanSystematicStudySustainable2017a}, we define reasonable weights for the whole network from two perspectives: \Emph{statistical data} and \Emph{textual information}.

The first perspective is statistical data. When the United Nations developed the 17 SDGs, it defined a series of indicators to quantify how far each goal has been achieved in each country and worldwide. From the official websites of the relevant organizations, we collected the performance of individual countries on these indicators over the past decades. Statistical correlation analysis then yields a quantitative correlation between different SDGs. However, this approach has a significant limitation: a statistically significant correlation does not by itself represent the influence that actually exists between two goals. Two variables with no direct causal link (e.g., beer sales and the number of married people in the U.S. during 1960--1985~\cite{smith2014standard}, shown in figure \ref{fig:beer_and_marry}) can still be strongly correlated, so we need another perspective to assign more reasonable weights to the network we have constructed.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.75\textwidth]{figures/beer and marry.png}
  \caption{A typical example of ``correlation is different from causality''}
  \label{fig:beer_and_marry}
\end{figure}

With the development of Natural Language Processing (NLP) technology, we can also use textual information to analyze the degree of influence between different SDGs. We obtain an embedding representation of each SDG from the text directly related to it; the angle between the embedding vectors of texts with similar meaning is small, and the angle between dissimilar texts is large. The cosine similarity between these embeddings therefore measures how closely two SDGs are related. Combining it with the correlation coefficients obtained from the statistical data, we obtain reasonable values for the edge weights in the network.

\subsection{Statistical Data Analysis}

\subsubsection{Source of SDG Completion Metrics Data}

The United Nations has defined a series of indicators for each SDG, which quantify each country's progress toward completing the SDGs\cite{UNSDSDGsAPI}\cite{SustainableDevelopmentReport}. We need to extract valuable information from them to calculate the relationships between SDGs.

For each SDG, we selected one or two representative indicators based on data completeness and indicator importance, as shown in Table \ref{tab:selected_indicators}.

\begin{table}[H]
  \centering
  \caption{Selected indicators}
  \label{tab:selected_indicators}
  \resizebox{\textwidth}{!}{%
    \begin{tabular}{lll}
      \toprule
      SDG                                                         & Target value & Description                                                    \\
      \midrule
      1: No Poverty                                               &
      0                                                           &
      \textit{1.1 Poverty headcount ratio at \$1.90/day}                                                                                          \\
      \multirow{2}{*}{2: Zero Hunger}                             &
      0                                                           &
      \textit{2.1 Prevalence of undernourishment}                                                                                                 \\
                                                                  &
      0                                                           &
      \textit{2.2 Prevalence of stunting in children under 5 years of age}                                                                        \\
      \multirow{2}{*}{3: Good Health and Well-being}              &
      70                                                          &
      \textit{3.1 Maternal mortality rate}                                                                                                        \\
                                                                  &
      25                                                          &
      \textit{3.2 Mortality rate, under-5}                                                                                                        \\
      \multirow{2}{*}{4: Quality Education}                       &
      100                                                         &
      \textit{4.1 Participation rate in pre-primary organized learning}                                                                           \\
                                                                  &
      100                                                         &
      \textit{4.2 Net primary enrollment rate}                                                                                                    \\
      \multirow{2}{*}{5: Gender Equality}                         &
      100                                                         &
      \textit{5.1 Ratio of female-to-male mean years of education received}                                                                       \\
                                                                  &
      100                                                         &
      \textit{5.2 Ratio of female-to-male labor force participation rate}                                                                         \\
      \multirow{2}{*}{6: Clean Water and Sanitation}              & 100    & \textit{6.1 Population using at least basic drinking water services} \\
                                                                  &
      100                                                         &
      \textit{6.2 Population using at least basic sanitation services}                                                                            \\
      7: Affordable and Clean Energy                              &
      100                                                         &
      \textit{7.1 Population with access to electricity}                                                                                          \\
      \multirow{2}{*}{8: Decent Work and Economic Growth}         &
      0                                                           &
      \textit{8.1 Victims of modern slavery}                                                                                                      \\
                                                                  &
      0.85                                                        &
      \textit{8.2 Fundamental labor rights are effectively guaranteed}                                                                            \\
      9: Industry, Innovation and Infrastructure                  &
      15                                                          &
      \textit{9.1 Researchers per 1000 people}                                                                                                    \\
      10: Reduced Inequality                                      &
      27.5                                                        &
      \textit{10.1 Gini coefficient}                                                                                                              \\
      \multirow{2}{*}{11: Sustainable Cities and Communities}     & 0      & \textit{11.1 Proportion of urban population living in slums}         \\
                                                                  &
      5                                                           &
      \textit{11.2 Annual mean concentration of PM2.5}                                                                                            \\
      \multirow{2}{*}{12: Responsible Consumption and Production} & 2      & \textit{12.1 Production-based nitrogen emissions}                    \\
                                                                  &
      0.6                                                         &
      \textit{12.2 Non-recycled municipal solid waste}                                                                                            \\
      \multirow{2}{*}{13: Climate Action}                         &
      100                                                         &
      \textit{13.1 Carbon Pricing Score at EUR60/tCO2}                                                                                            \\
                                                                  &
      0                                                           &
      \textit{13.2 CO2 emissions from fossil fuel combustion and cement production}                                                               \\
      14: Life Below Water                                        &
      100                                                         &
      \textit{14.1 Ocean Health Index: Clean Waters score}                                                                                        \\
      15: Life on Land                                            &
      1                                                           &
      \textit{15.1 Red List Index of species survival}                                                                                            \\
      \multirow{2}{*}{16: Peace, Justice and Strong Institutions} &
      0.3                                                         &
      \textit{16.1 Homicides}                                                                                                                     \\
                                                                  &
      88.6                                                        &
      \textit{16.2 Corruption Perceptions Index}                                                                                                  \\
      17: Partnerships to achieve the Goal                        &
      15                                                          &
      \textit{17.1 Government spending on health and education}                                                                                   \\
      \bottomrule
    \end{tabular}%
  }
\end{table}

For specific methods of obtaining, processing, and cleaning these data, see Appendix \ref{app:data_cleaning}.

For each indicator in the table above, we obtained values for each country over consecutive years and aggregated them in proportion to population to obtain a world-level indicator, which was then used to calculate the correlations between indicators.
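
As a minimal illustration of this aggregation, one natural reading of ``in proportion to the population'' is a population-weighted mean. The symbols $x_c$ and $P_c$ and the numbers below are hypothetical and are not taken from the data set. Let $x_c$ be the indicator value of country $c$ and $P_c$ its population; the world-level value is then
\[
  x_{\mathrm{world}} = \frac{\sum_c P_c\, x_c}{\sum_c P_c}.
\]
For two countries with $P_1 = 100$ million, $x_1 = 10\%$ and $P_2 = 300$ million, $x_2 = 30\%$, this gives $(100 \times 10 + 300 \times 30)/400 = 25\%$, compared with an unweighted mean of $20\%$.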

\subsubsection{Correlation Analysis Method}

In statistics, correlation or dependence refers to any statistical relationship between two random variables or bivariate data\cite{Correlation2022}.

Given a series of $n$ measurements of the pair $(X_i, Y_i)$ indexed by $i = 1, \dots, n$, the population Pearson correlation $\rho_{X,Y}$ can be estimated by the \emph{sample correlation coefficient} $rs_{X,Y}$, defined as:

\begin{equation}
  rs_{X,Y} =\frac{\sum_{i=1}^{n}(X_i - \overline{X})(Y_i - \overline{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \overline{X})^2 \sum_{i=1}^{n}(Y_i - \overline{Y})^2}},
\end{equation}
where $\overline{X}$ and $\overline{Y}$ are the sample means of $X$ and $Y$.
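
As a small worked example with made-up numbers (not taken from the SDG data), take $n = 3$ with $X = (1, 2, 3)$ and $Y = (1, 3, 2)$, so that $\overline{X} = \overline{Y} = 2$. The deviations from the means are $(-1, 0, 1)$ and $(-1, 1, 0)$, which gives
\[
  rs_{X,Y} = \frac{(-1)(-1) + (0)(1) + (1)(0)}{\sqrt{(1 + 0 + 1)(1 + 1 + 0)}} = \frac{1}{\sqrt{4}} = 0.5.
\]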

\subsubsection{Results and Analysis}

We calculate the \emph{correlation coefficient obtained from statistical data} $\mathbf{rs}$ between each pair of SDGs using the sample correlation coefficient defined above (if an SDG has more than one indicator, the arithmetic mean is taken), and the results are shown in figure \ref{fig:correlation_map}.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.75\textwidth]{figures/rs.png}
  \caption{Correlation coefficient heat map from statistical data}
  \label{fig:correlation_map}
\end{figure}

% A typical example of "correlation is different from causality"

\subsection{Textual Data Analysis}

\subsubsection{Source of SDG Related Textual Data}

To obtain an accurate representation of each SDG, we need the most relevant, high-quality textual data for that SDG. We selected the standard document \emph{Global indicator framework for the Sustainable Development Goals and targets of the 2030 Agenda for Sustainable Development} from the official website of the United Nations, which has a large volume of text and basically meets our requirements for data quality (figure \ref{fig:official_sgds}).

\begin{figure}[H]
  \centering
  \includegraphics[width=\textwidth]{figures/image-20230219155756162.png}
  \caption{Interpretation of UN official indicators of SDG 1}% the UN's official explanation of targets and indicators
  \label{fig:official_sgds}
\end{figure}

\subsubsection{Natural Language Processing and the Sentence-BERT Model}

In the field of Natural Language Processing, a common method to address clustering and semantic search is to map each sentence to a vector space such that semantically similar sentences are close.

With the help of the Sentence-BERT model, which uses a Siamese network architecture\cite{reimersSentenceBERTSentenceEmbeddings2019}, fixed-size vectors can be derived for input sentences. Using a similarity measure such as cosine similarity, Manhattan distance, or Euclidean distance, we can then obtain the similarity of sentences and paragraphs.

The general framework for calculating the similarity between SDGs with the Sentence-BERT model is shown in Figure \ref{fig:textual_to_rt}; the specific implementation of the model is given in Appendix \ref{app:sbert}.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.5\textwidth]{figures/sbert.png}
  \caption{Flow chart of getting $\mathbf{rt}$ from text data}
  \label{fig:textual_to_rt}
\end{figure}

According to the framework in Figure \ref{fig:textual_to_rt}, each sentence describing the SDG can be represented as a vector of $384$ dimensions. After being normalized, each sentence can be considered as a point on a high-dimensional sphere, and sentences with similar meanings are closer together. After obtaining these vectors, we average the vectors representing the same SDG and then use cosine-similarity to calculate the similarity coefficient obtained from textual data $rt$ between each pair of SDGs.

The cosine-similarity of two vectors is defined by:

\begin{equation}
  CosineSim(\mathbf{v}_a, \mathbf{v}_b) = \frac{\mathbf{v}_a \cdot \mathbf{v}_b}{|\mathbf{v}_a||\mathbf{v}_b|}.
\end{equation}
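
To illustrate the averaging and similarity steps with hypothetical two-dimensional vectors (the actual embeddings have $384$ dimensions), suppose one SDG is described by two unit sentence vectors $(1, 0)$ and $(0, 1)$, whose mean is $\mathbf{v}_a = (0.5, 0.5)$, and another SDG by the single vector $\mathbf{v}_b = (1, 0)$. Dividing the dot product by the norms of both vectors gives
\[
  CosineSim(\mathbf{v}_a, \mathbf{v}_b) = \frac{0.5 \times 1 + 0.5 \times 0}{\sqrt{0.5^2 + 0.5^2} \times 1} = \frac{0.5}{\sqrt{0.5}} \approx 0.71.
\]
The averaged vector no longer has unit length ($|\mathbf{v}_a| \approx 0.71$), which is why both norms appear in the denominator.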

\subsubsection{Results and Analysis}

To visualize the property that ``sentences with similar meaning are closer together'', we project the results onto a two-dimensional plane (figure \ref{fig:projection_2d}).

\begin{figure}[H]
  \centering
  \includegraphics[width=0.4\textwidth]{figures/sentences1.png}
  \includegraphics[width=0.4\textwidth]{figures/sentences.png}
  \caption{Sentence vectors projected on a two-dimensional plane}
  \label{fig:projection_2d}
\end{figure}


The resulting values of $rt$ are shown in figure \ref{fig:similarity_coefficient}.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.45\textwidth]{figures/rt.png}
  \caption{Similarity coefficient heat map}
  \label{fig:similarity_coefficient}
\end{figure}

\subsection{Network Construction}

With both coefficients available, we can now build the SDG network. The network is a fully connected graph (i.e., $\forall u,v \in V$ with $u \neq v$, $\exists w \in \mathbb{R}$ such that $(u,v,w)\in E$) with weights defined as follows:

\begin{equation}
  \forall(u, v, w)\in E,\quad w:=rs_{u,v}\times \max(rt_{u,v},0)
\end{equation}
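
For illustration, consider hypothetical coefficients (not values from our results), writing $rs_{u,v}$ and $rt_{u,v}$ for the statistical correlation and the text similarity of the pair $(u, v)$. A pair with $rs_{u,v} = -0.8$ and $rt_{u,v} = 0.5$ receives
\[
  w = -0.8 \times \max(0.5, 0) = -0.4,
\]
whereas a pair with $rs_{u,v} = 0.9$ and $rt_{u,v} = -0.1$ receives $w = 0.9 \times \max(-0.1, 0) = 0$. The sign of $w$ therefore always comes from the statistical correlation, and the text similarity only scales its magnitude. Since $|rs_{u,v}| \le 1$ and $rt_{u,v} \le 1$, the weight stays in $[-1, 1]$.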

The visualization of the network edge weights and the network structure is shown in figure \ref{fig:net_visualization}, where the thickness of the network edges represents the magnitude of their values.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.4\textwidth]{figures/r.png}
  \includegraphics[width=0.4\textwidth]{figures/graph.png}
  \caption{Network heat map and visualization}
  \label{fig:net_visualization}
\end{figure}

Here is an illustration of the whole procedure of building the network (figure \ref{fig:build_network}).

\begin{figure}[H]
  \centering
  \includegraphics[width=\textwidth]{figures/network_procedure.png}
  \caption{Flow chart of building the network}
  \label{fig:build_network}
\end{figure}

Table \ref{tab:3_relevant_sdgs} and table \ref{tab:3_constraining_sdgs} list the three most relevant and the three most mutually constraining pairs of SDGs.

\begin{table}[H]
  \centering
  \caption{Three most relevant SDGs}
  \label{tab:3_relevant_sdgs}
  \resizebox{\textwidth}{!}{%
    \begin{tabular}{llc}
      \toprule
      SDG a                                           & SDG b                                     & $w$  \\
      \midrule
      GOAL 9: Industry, Innovation and Infrastructure & GOAL 17: Partnerships to achieve the Goal & 0.77 \\
      GOAL 9: Industry, Innovation and Infrastructure & GOAL 7: Affordable and Clean Energy       & 0.67 \\
      GOAL 9: Industry, Innovation and Infrastructure & GOAL 2: Zero Hunger                       & 0.62 \\
      \bottomrule
    \end{tabular}%
  }
\end{table}

\begin{table}[H]
  \centering
  \caption{Three most mutually constraining SDGs}
  \label{tab:3_constraining_sdgs}
  \begin{tabular}{llc}
    \toprule
    SDG a                 & SDG b                                           & $w$   \\
    \midrule
    GOAL 15: Life on Land & GOAL 6: Clean Water and Sanitation              & -0.62 \\
    GOAL 15: Life on Land & GOAL 9: Industry, Innovation and Infrastructure & -0.61 \\
    GOAL 15: Life on Land & GOAL 17: Partnerships to achieve the Goal       & -0.56 \\
    \bottomrule
  \end{tabular}%
\end{table}

\section{Problem Two: Ranking Target and Predicting Achievement}

The SDGs are important strategic goals for every country. Setting priorities among them requires weighing several considerations.

The first consideration is the importance of the goal itself, which is somewhat controversial and may include discussions about human rights and ethics. Here we use data from the website \emph{sdgsinorder.org}, obtained by surveying the recommendations of a certain number of experts, which guarantees a certain objectivity.

In Problem One, we constructed a network between SDGs and found that the SDGs influence one another, both positively and negatively. We want to exploit the positive influences so that well-chosen goals maximize the benefit obtained from our efforts. We use the PageRank algorithm to quantify the positive influence of each SDG as a second indicator for setting SDG priorities.

Besides importance and network structure, we also need to consider the urgency of completing each SDG. Some SDGs are still far from the 2030 targets set by the UN, while others are close to completion; some are relatively easy to complete, while others require greater effort. In Section 5.3, we combine these factors into a third indicator for setting SDG priorities.

\subsection{Target Importance Index}

Figure \ref{fig:sgd_in_order_outside} shows the importance of each SDG itself, obtained from the website \emph{SDG in Order}~\cite{Goals}.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.75\textwidth]{figures/SDG in order.png}
  \caption{Snapshot of expert scores on SDG in Order}
  \label{fig:sgd_in_order_outside}
\end{figure}

These metrics need to be normalized. Let $\mathbf{It}'=(It_1', \dots, It_{17}')$ denote the importance metrics obtained from the website, where $It_n'$ is the importance metric of the $n$th SDG. The normalized importance metric $\mathbf{It}$ is obtained by the following equation.

\begin{equation}
  \mathbf{It}=\frac{\mathbf{It}'}{\sum_{i=1}^{17}It'_i}
\end{equation}



\subsection{Target Structure Importance Index}

Noting the correlation between SDGs, we use the PageRank algorithm to compute the importance of each SDG in terms of structure.

We first construct the transfer matrix $\mathbf{M}$ from the network $G=(V,E)$ built in Section 4.3:

\begin{equation}
  M_{v,u}=\frac{e^{w_{v,u}}}{\sum_{s\in V}e^{w_{u,s}}}
\end{equation}
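As a small numerical illustration of this softmax-style normalization, consider a hypothetical node $u$ with only two neighbours $a$ and $b$ and weights $w_{a,u}=0.77$ and $w_{b,u}=-0.62$. These values are illustrative rather than taken from our network, and we assume the symmetric weights $w_{u,s}=w_{s,u}$ of an undirected graph:

\begin{equation*}
  M_{a,u}=\frac{e^{0.77}}{e^{0.77}+e^{-0.62}}\approx\frac{2.160}{2.160+0.538}\approx 0.80,\qquad
  M_{b,u}=\frac{e^{-0.62}}{e^{0.77}+e^{-0.62}}\approx 0.20 .
\end{equation*}

The exponential keeps every entry positive, so a mutually constraining pair still receives a small share, while a mutually reinforcing pair receives most of it.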

Using algorithm \ref{alg:two}, we obtain the structural importance of each SDG, $\mathbf{Is}=(Is_1,\dots,Is_{17})$, where $Is_n$ denotes the structural importance of the $n$th SDG.

\RestyleAlgo{ruled}
\SetKwComment{Comment}{/* }{ */}
\begin{algorithm}[H]
  \caption{PageRank Algorithm}\label{alg:two}
  \KwData{Number of graph nodes $n$, transfer matrix $\mathbf{M}$, damping factor $d$, initial structural importance $\mathbf{Is}_0$, accuracy $\varepsilon$, maximum number of iterations $count$}
  \KwResult{Final structural importance $\mathbf{Is}$}
  $count \gets 100$

  $t \gets 1$

  \While{$t \le count$}{
    $\mathbf{Is}_{t}=d\,\mathbf{Is}_{t-1}\mathbf{M} + \frac{1-d}{n} \cdot \mathbf{1}$

    \If{$\|\mathbf{Is}_{t} - \mathbf{Is}_{t-1}\|_2 < \varepsilon$}{
      break
    }
    $t \gets t+1$
  }
\end{algorithm}

We normalized $\mathbf{Is}$ using the following operation:

\begin{equation}
  \mathbf{Is}\leftarrow\frac{\mathbf{Is}}{\sum_{i=1}^{17}Is_i}
\end{equation}

Here is the visualization of  $\mathbf{Is}$ (figure \ref{fig:Is}):

\begin{figure}[H]
  \centering
  \includegraphics[width=0.75\textwidth]{figures/Target Structural Importance Index.png}
  \caption{Bar Chart of Target Structure Importance Index $\mathbf{Is}$}
  \label{fig:Is}
\end{figure}

\subsection{Target Difficulty Index}

We use the fastest growth rate to describe how efficiently a positive indicator can advance, and the fastest decrease rate for a negative indicator. These two rates approximate the progress that is achievable when we are fully committed to a goal.

The fastest growth rate $spd_p$ and the fastest decrease rate $spd_n$ are calculated as follows:

\begin{align}
  spd_{p, d} &= \max_t\{n_{d, t+1}-n_{d, t}\} \\
  spd_{n, d} &= \min_t\{n_{d, t+1}-n_{d, t}\}
\end{align}

where $n_{d,t}$ is the value of indicator $d$ in year $t$. For convenience, we write $spd_d$ for the fastest rate of progress of indicator $d$: the growth rate for a positive indicator and the decrease rate for a negative one.

However, different indicators have different magnitudes, so their rates cannot be compared directly. We therefore ask how many years each indicator would need to reach its target at the corresponding rate, that is, we use time to describe the difficulty of completing the indicator. For an indicator $d$ with target value $T_d$ and current value $I_{c,d}$, the expected completion time (in years) is:

\begin{equation}
  t_d=\frac{T_d-I_{c, d}}{spd_d}
\end{equation}

The longer the expected completion time, the more difficult the indicator is to complete. To put the values on a common scale, we normalize them:

\begin{equation}
  t^*_d=\frac{t_d}{\max_d\{t_d\}}
\end{equation}

This equation is still imperfect as a measure of difficulty because, as we found when processing the data, the calculated time can be negative. A negative expected completion time means that the indicator is positive but its data have been decreasing, or negative but its data have been increasing. In other words, efforts in these areas have been misdirected and progress has gone backwards, so completing these indicators should be considered more difficult. For such indicators, the fastest speed computed above is exactly their slowest rate of change in the wrong direction, and we use it as their possible future rate to calculate the expected completion time. To ensure that their difficulty exceeds that of indicators with positive time, we take $1+|t^*_d|$ as the difficulty. The final difficulty formula is therefore:

\begin{equation}
  Id_d = \begin{cases}
    t^*_d,     & t^*_d \ge 0 \\
    1+|t^*_d|, & t^*_d<0
  \end{cases}
\end{equation}
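
To illustrate the two branches, consider two hypothetical indicators (the numbers are illustrative and not taken from our data). Indicator $a$ is positive with $T_a=100$, $I_{c,a}=80$ and $spd_a=2.5$ per year. Indicator $b$ is negative with $T_b=0$ and $I_{c,b}=30$, but its data have been rising, so its slowest rise $spd_b=1.5$ per year is used:

\begin{align*}
  t_a   &= \frac{100-80}{2.5}=8, & t_b   &= \frac{0-30}{1.5}=-20, \\
  t^*_a &= \frac{8}{8}=1,        & t^*_b &= \frac{-20}{8}=-2.5,   \\
  Id_a  &= 1,                    & Id_b  &= 1+|-2.5|=3.5 .
\end{align*}

Assuming these were the only two indicators, $\max_d\{t_d\}=8$, and the regressing indicator $b$ is ranked as harder than $a$, as intended.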

We normalized $\mathbf{Id}$ using the following operation:

\begin{equation}
  \mathbf{Id}\leftarrow\frac{\mathbf{Id}}{\sum_{i=1}^{17}Id_i}
\end{equation}

Here is the visualization of  $\mathbf{Id}$ (figure \ref{fig:Id}):

\begin{figure}[H]
  \centering
  \includegraphics[width=0.75\textwidth]{figures/Target Difficulty Index.png}
  \caption{Bar Chart of Target Difficulty Index $\mathbf{Id}$}
  \label{fig:Id}
\end{figure}

\subsection{Target Effectiveness Ranking Using Analytic Hierarchy Process (AHP)}

We constructed three indices, which must be considered together when deciding the final priorities. We chose the commonly used Analytic Hierarchy Process (AHP) for this purpose. AHP is a decision-making method for complex systems that are difficult to quantify fully; it requires only the relative importance of each pair of indices to derive the weight of each index~\cite{nguyenAnalyticHierarchyProcess}.

The hierarchy of the AHP we use is as follows.
\begin{itemize}
  \item Target layer: determine the priorities.
  \item Criterion layer: three constructed indices --- $\mathbf{It}, \mathbf{Is}, \mathbf{Id}$.
  \item Program layer: 17 SDGs.
\end{itemize}

Since every SDG is fully quantified within each index, the extended hierarchy degenerates to a simple single-layer form, and we only need to construct one pairwise comparison matrix $A$:

\begin{equation}
  A = \begin{bmatrix}
    1   & 1/4  & 3  \\
    4   & 1    & 10 \\
    1/3 & 1/10 & 1
  \end{bmatrix},
\end{equation}
where $a_{ij}$ is the importance of index $i$ relative to that of index $j$, so that $a_{ij} = 1 / a_{ji}$. In this matrix, subscripts 1--3 represent $\mathbf{It}, \mathbf{Is}, \mathbf{Id}$ respectively.

If $a_{ij} = 2$ and $a_{jk} = 2$, then consistency requires $a_{ik} = 4$; a matrix satisfying $a_{ij}a_{jk}=a_{ik}$ for all $i,j,k$ is called a consistent matrix. We do not have to impose this restriction when constructing $A$ in AHP, because the weight of each index can also be derived from an inconsistent matrix. First, however, we need to check how consistent $A$ is. It can be proved that the largest eigenvalue of an $n\times n$ positive reciprocal matrix $A$ satisfies $\lambda_{\max} \ge n$, with equality exactly when $A$ is consistent. We define the consistency index $CI = (\lambda_{\max} - n) / (n - 1)$, and obtain the random consistency index $RCI$ by randomly generating a large number of matrices $A$ and averaging their $CI$. The values are listed in table \ref{tab:RCI}. If $CI / RCI < 0.1$, then $A$ is not too inconsistent and the result obtained by AHP is acceptable.

\begin{table}[H]
  \centering
  \caption{RCI values for different matrix sizes $n$}
  \label{tab:RCI}
  \begin{tabular}{lcccccccccc}
    \toprule
    $n$   & 1    & 2    & 3    & 4    & 5    & 6    & 7    & 8    & 9    & 10   \\
    \midrule
    $RCI$ & 0.00 & 0.00 & 0.58 & 0.90 & 1.12 & 1.24 & 1.32 & 1.41 & 1.45 & 1.49 \\
    \bottomrule
  \end{tabular}
\end{table}

For the specific $A$ shown above, we have $\lambda_{\max} = 3.0036946$ and $CI / RCI = 0.003185 < 0.1$, which means that $A$ is acceptable. Next, we use $\mathbf{x}_{\max}$, the normalized eigenvector corresponding to $\lambda_{\max}$, as the weights of the indices: $\mathbf{x}_{\max} = [0.19537494, 0.73541944, 0.06920562]$.
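
These figures can be checked by hand. With $n=3$ and $RCI=0.58$ from table \ref{tab:RCI},

\begin{equation*}
  CI=\frac{3.0036946-3}{3-1}\approx 0.0018473,\qquad
  \frac{CI}{RCI}\approx\frac{0.0018473}{0.58}\approx 0.003185 .
\end{equation*}

As an independent cross-check (not the procedure used above), the normalized geometric means of the rows of $A$ give nearly the same weights:

\begin{align*}
  g_1 &= \left(1\cdot\tfrac{1}{4}\cdot 3\right)^{1/3}\approx 0.9086,\quad
  g_2 = \left(4\cdot 1\cdot 10\right)^{1/3}\approx 3.4200,\quad
  g_3 = \left(\tfrac{1}{3}\cdot\tfrac{1}{10}\cdot 1\right)^{1/3}\approx 0.3218, \\
  \frac{(g_1,g_2,g_3)}{g_1+g_2+g_3} &\approx (0.1954,\ 0.7354,\ 0.0692).
\end{align*}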

The priority vector $\mathbf{p}$ of the goals is obtained using the following weighted average formula:

\begin{equation}
  p_i=\mathbf{x}_{\max}\cdot(It_i, Is_i,Id_i)
\end{equation}
Here is the visualization of  $\mathbf{p}$ (figure \ref{fig:p_visualization}):

\begin{figure}[H]
  \centering
  \includegraphics[width=\textwidth]{figures/priority.png}
  \caption{Bar Chart of priority vector $\mathbf{p}$}
  \label{fig:p_visualization}
\end{figure}

According to our results, \emph{GOAL 9: Industry, Innovation and Infrastructure}, \emph{GOAL 7: Affordable and Clean Energy} and \emph{GOAL 16: Peace, Justice and Strong Institutions} have the highest priority. The structural importance of an SDG is the dominant factor in the prioritization. Interestingly, the SDGs on environmental issues (\emph{GOAL 14: Life Below Water}, \emph{GOAL 15: Life on Land}) are so difficult to achieve that they receive a higher priority. \emph{GOAL 10: Reduced Inequality} receives a higher priority because of its own importance.

\subsection{Achievement Prediction in the Next 10 Years}

It is extremely difficult to determine which goals can be achieved in 10 years under our priorities, because the SDG targets and indicators give few quantitative criteria. To address this, we use the Long-Term Objectives (LTO) from the Sustainable Development Report (SDR) as quantitative criteria~\cite{SustainableDevelopmentReport}. We believe the LTO are more stringent than the criteria of the SDGs themselves, which we could hardly find online and which may not even exist.

To predict what can be achieved in 10 years under our priorities, we first predict what is likely to be achieved in 10 years under existing trends as a baseline. We use two methods to obtain the baseline: the first gives the most optimistic estimate and the second gives a more likely estimate. The worldwide average of each indicator over time is obtained by the method described in appendix \ref{app:data_cleaning}.

The first method finds the maximum rate of completion that occurred in 2000--2021 and then calculates, from the distance between the current value and the LTO, how many more years are needed to achieve an indicator at that maximum rate. In our opinion, only indicators with a result of less than 10 years can possibly be reached, either in reality or under our priorities, because even if the world follows our priorities, an indicator is unlikely to outperform its best previous year in each of the next 10 years. We found that the 8 goals listed in table \ref{tab:possible_sdgs} can possibly be achieved: each has at least one indicator that can reach its LTO within 10 years at the maximum rate of completion.

\begin{table}[H]
  \centering
  \caption{SDGs that are possible to achieve, with the years needed at the maximum rate of completion}
  \label{tab:possible_sdgs}
  \begin{tabular}{lcccccccc}
    \toprule
    Goal  & SDG1  & SDG2  & SDG3  & SDG4  & SDG7  & SDG10 & SDG11 & SDG12 \\
    \midrule
    Years & 5.032 & 9.247 & 9.458 & 7.611 & 6.828 & 3.925 & 0.570 & 2.569 \\
    \bottomrule
  \end{tabular}
\end{table}

The second method uses a sliding window of width $W$: the data of years $x - W, \ldots, x - 1$ are used to fit a polynomial of degree $d$, which gives the predicted value for year $x$ ($W > d + 1$). The window is then shifted by one year, so that the data of years $x - W + 1, \ldots, x$ (including the value just predicted) are used to predict year $x + 1$, and so on for 10 iterations. The results were somewhat discouraging: no indicator reaches its LTO after 10 years, perhaps because, as mentioned above, the criteria we chose are too strict.
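
One way to write a single step of this procedure, assuming an ordinary least-squares fit (the fitting criterion is an assumption here) and writing $c_0,\dots,c_d$ for the polynomial coefficients (notation introduced only for this sketch), is

\begin{align*}
  (\hat c_0,\dots,\hat c_d) &= \arg\min_{c_0,\dots,c_d}\sum_{k=1}^{W}\Bigl(n_{x-k}-\sum_{j=0}^{d}c_j\,(x-k)^j\Bigr)^2, \\
  \hat n_{x} &= \sum_{j=0}^{d}\hat c_j\,x^j ,
\end{align*}

where $n_t$ is the value of the indicator in year $t$. The prediction $\hat n_x$ then takes the place of $n_x$ when the window moves on to year $x+1$. The condition $W>d+1$ means that there are more data points than coefficients, so the polynomial smooths the data rather than interpolating it.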

Finally, we consider the impact of our priorities. We take the top 9 goals in our priority ranking as the SDGs to focus on in the future, and intersect them with the SDGs listed in table \ref{tab:possible_sdgs}. We conclude that \Emph{SDG1, SDG2, SDG3, SDG7} may be achieved.
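
Written out explicitly, and assuming the top 9 goals are those in the ``No Crisis'' row of table \ref{tab:problem_4}, the intersection is

\begin{equation*}
  \{9,7,16,17,6,5,3,1,2\}\cap\{1,2,3,4,7,10,11,12\}=\{1,2,3,7\}.
\end{equation*}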

\section{Problem Three: Impact of SDG Target Completion on the Network and Proposing New Goals}

\subsection{The Impact of SDG Target Completion}

When a goal is completed, its impact in the whole network will disappear, possibly changing the specific architecture of the network and thus affecting the value of Target Structure Importance Index $\mathbf{Is}$. We take SDG1 (No Poverty) and SDG2 (Zero Hunger) as examples to analyze the changes to the network architecture (figure \ref{fig:impact_of_sdg1sdg2}).

\begin{figure}[H]
  \centering
  \includegraphics[width=\textwidth]{figures/Finish Goal.png}
  \caption{Impact of achieving SDG1 and SDG2}
  \label{fig:impact_of_sdg1sdg2}
\end{figure}

A few changes in the order of the SDGs are noticeable, such as the increased structural priority of GOAL 16: Peace, Justice and Strong Institutions after SDG1: No Poverty is completed, but the overall network architecture does not change very much. The changes in priority are understandable: in the example above, peace and strong institutions become more important once poverty has been addressed. It is also reasonable that the overall architecture does not change significantly, because the network among the 17 SDGs is very complex and its interactions do not change dramatically when a single SDG is removed.

The priority of our targets is also influenced by the Target Importance Index and the Target Difficulty Index. When an SDG is completed, its Target Importance Index obtained from the expert survey may decrease, while the indices of other SDGs may increase. Since no SDG can be completed in a short time and the path to completion may vary greatly, we cannot directly give a reasonable estimate of the Target Importance Index. By the same token, the Target Difficulty Index will also vary with the actual situation. Fortunately, our model still yields a reasonable prioritization as long as it is supplied with data that reflect the actual situation.

As an example, we nevertheless made a rough estimate of the change in target priorities after completing SDG1, as shown in figure \ref{fig:SDG1_achieved}:

\begin{figure}[H]
  \centering
  \includegraphics[width=\textwidth]{figures/priority (Finish goal 1).png}
  \caption{Priorities after SDG1 is achieved}
  \label{fig:SDG1_achieved}
\end{figure}

\subsection{New Goals}

Culture is an indispensable part of the development of human societies: it contains the essence of the world's civilizations and the wealth of knowledge accumulated in the course of human development. We therefore propose two goals, mainly concerning culture, explain each of them, and point out the existing goals with which they may be strongly connected.

\begin{itemize}
  \item \Emph{Protection of cultural heritage} Cultural heritage is the material and spiritual wealth left behind by human civilization in the course of its development from ancient times to the present. It records the footprints of human social development and represents the cultural characteristics of different regions and peoples, and it helps both to build national cultural confidence and to enrich people's spiritual world. However, about 40\% of the world's languages and many elements of intangible cultural heritage face disappearance, so protecting cultural heritage is becoming more and more important. In terms of the SDGs, it has a great impact on Partnerships to Achieve the Goal and on Peace, Justice and Strong Institutions, and will also improve people's well-being to a certain extent.
  \item \Emph{Knowledge sharing} With the development of technology, knowledge is growing explosively, and the development of the Internet has made it easier for people to share information. However, the monopolization of knowledge is common. The technological monopoly of large countries over small ones, and the monopoly of certain data sites over scientific research results, divide people according to how difficult it is for them to acquire knowledge. This is very unfavorable to Reduced Inequality and largely hinders the economic development of some poorer regions. We therefore believe that knowledge sharing needs to be strengthened to better promote the common progress of all human beings.
\end{itemize}

\section{Problem Four: Influences of External Factors from the Network Perspective}

The world is universally connected: after a crisis, $\mathbf{It}, \mathbf{Is}, \mathbf{Id}$ in our network are all affected. For our model, however, estimating changes in $\mathbf{It}$ and $\mathbf{Id}$ is easier, while estimating changes in $\mathbf{Is}$ is much more difficult. Our model builds NLP semantic networks and covariance networks from real data, and when a crisis arises we cannot predict how the UN will modify its SDG descriptions or how each real-world data series will change. Therefore, although we think that $\mathbf{Is}$ will be affected, we do not change $\mathbf{Is}$ in our calculation; these unknown changes will be partially reflected in the changes of $\mathbf{It}$ and $\mathbf{Id}$.

For $\mathbf{It}$ and $\mathbf{Id}$, we divide the impact of a crisis into a direct effect and an indirect effect, and simulate the network after a crisis as follows: the directly affected goals are selected according to the type of crisis, and those of their $It_i, Id_i$ that are notably affected are raised or lowered. Because of the normalization steps in our process, the impact automatically spreads to the remaining, indirectly affected goals. Table \ref{tab:problem_4} lists the goals considered to be directly affected and the resulting priorities under each crisis.

\begin{table}[H]
  \centering
  \caption{Directly affected SDGs and new priorities in case of each crisis}
  \label{tab:problem_4}
  \resizebox{\textwidth}{!}{
  \begin{tabular}{lll@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l@{\enspace}l}
    \toprule
  Crisis                 &     SDGs    & \multicolumn{17}{c}{Priorities}                                                  \\
    \midrule
  No Crisis              & /           & 9  & 7 & 16 & 17 & 6  & 5  & 3 & 1  & 2  & 15 & 10 & 14 & 4  & 12 & 8  & 11 & 13 \\
  Technological advances & 7,8,9,11,12 & 9  & 7 & 16 & 17 & 6  & 5  & 3 & 1  & 15 & 2  & 14 & 10 & 4  & 12 & 11 & 8  & 13 \\
  Global pandemics       & 3,6         & 6  & 3 & 9  & 7  & 17 & 16 & 5 & 2  & 1  & 15 & 14 & 10 & 4  & 12 & 11 & 8  & 13 \\
  Climate change         & 11,13,14,15 & 9  & 7 & 17 & 16 & 6  & 15 & 5 & 3  & 14 & 13 & 2  & 1  & 10 & 4  & 12 & 11 & 8  \\
  Regional wars          & 5,16        & 16 & 9 & 5  & 7  & 17 & 6  & 3 & 1  & 2  & 15 & 14 & 10 & 4  & 12 & 11 & 8  & 13 \\
  Refugee movements      & 5,6,10,16   & 16 & 9 & 5  & 7  & 6  & 17 & 3 & 10 & 2  & 1  & 15 & 14 & 4  & 12 & 11 & 8  & 13 \\
    \bottomrule
  \end{tabular}}
\end{table}

The results show that, under our assumptions, a crisis does change the priorities: the changes are concentrated on the goals directly related to the crisis, while the rest of the ordering shifts only slightly. This suggests that our model has good stability.

\section{Problem Five: Help Companies and Organizations with Our Model}

Our model can help other companies and organizations set priorities among their goals in the following ways.

\begin{itemize}
  \item \Emph{Companies or organizations can leverage multiple kinds of data.} Decision makers can use both textual and statistical data. Textual data generally come from company documents such as meeting minutes and work reports, or from questionnaires and online forums that capture people's opinions on the goals; they reflect more subjective considerations. Statistical data can come from the company's financial data, sales data, product quality testing data, stock trends, number of stockholders, and other numerical data, which reflect more of the reality behind the goals. The importance of each goal can be decided through meetings, shareholder polls for listed companies, or public opinion surveys for civil organizations, while the difficulty of accomplishing a goal is determined entirely by objective data and can be deduced from the records of the company or organization over the years. Our results can thus reflect both objective needs and, to a certain extent, people's subjective will. At the same time, the use of AHP reduces the difficulty of subjective judgment for decision makers, which can improve the quality of decisions and help find the right goal.
  \item \Emph{Companies or organizations can make modifications to the goals.} With our model, it is only necessary to remove a goal that has already been reached from the graph, add the new goal, re-analyze the structure of the graph, and combine it with the other factors to obtain a new network structure and a new set of priorities. This improves the flexibility of the model to some extent, leaves room for error in the decision making of the company or organization, and helps them adjust their plans to the actual situation, so as to meet actual needs and make decisions oriented to more long-term benefits.
  \item \Emph{The model requires the company or organization to set quantifiable achievement conditions for its goals.} This not only facilitates the data processing of the model, but also facilitates the development of the company. It is important to have clear and quantifiable goals, because they make the efforts of the company or organization clear and facilitate the design and implementation of phased plans, which is beneficial in both the short and long term.
\end{itemize}

\section{Extra Experiments}

\subsection{Convergence Analysis}

The purpose of the PageRank algorithm described in Section 5.2 is to make the structural importance vector $\mathbf{Is}$ converge to a fixed value through iteration. Here we analyze the convergence process for different damping factors $d$. The results are shown in figure \ref{fig:convergence_analysis}, which plots the L2 distance of $\mathbf{Is}$ between adjacent iterations.

\begin{figure}[H]
  \centering
  \includegraphics[width=0.6\textwidth]{figures/convergence.png}
  \caption{Convergence curve}
  \label{fig:convergence_analysis}
\end{figure}
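
A rough estimate of the iteration count can be derived from the update rule, assuming the rows of $\mathbf{M}$ sum to one (an assumption about $\mathbf{M}$ made only for this sketch) and measuring distances in the 1-norm instead of the 2-norm used in the figure. Subtracting two consecutive updates gives

\begin{equation*}
  \mathbf{Is}_{t+1}-\mathbf{Is}_{t}=d\,(\mathbf{Is}_{t}-\mathbf{Is}_{t-1})\mathbf{M}
  \quad\Longrightarrow\quad
  \|\mathbf{Is}_{t+1}-\mathbf{Is}_{t}\|_1\le d\,\|\mathbf{Is}_{t}-\mathbf{Is}_{t-1}\|_1 ,
\end{equation*}

so the distance shrinks at least geometrically with ratio $d$. For the illustrative values $d=0.85$ and $\varepsilon=10^{-6}$ (neither is fixed in the text) and an initial distance of order one, the distance falls below $\varepsilon$ after roughly

\begin{equation*}
  t\approx\frac{\ln\varepsilon}{\ln d}=\frac{\ln 10^{-6}}{\ln 0.85}\approx\frac{-13.82}{-0.1625}\approx 85
\end{equation*}

iterations, which is within the limit of 100 iterations in algorithm \ref{alg:two}. A smaller $d$ converges faster.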

\subsection{Sensitivity Analysis}

In Section 6.1, we analyzed how our SDG network behaves after one node is deleted, which already reflects the stability of the network architecture to a large extent. In fact, deleting three or even more nodes still has little effect on the overall network or on the ranking of SDG structural importance. We believe this is reasonable given how the network was constructed.

\section{Strengths and Weaknesses}

\subsection{Strengths}

\begin{itemize}
  \item \Emph{Our model takes into account both subjective and objective factors.} First, we used both textual and statistical data for the analysis. For the textual data, we took the content of the indicators of each of the 17 goals and analyzed the similarity of the goals using NLP methods; for the statistical data, we used the individual indicators of each goal as measures to calculate the correlation between the goals, and finally combined the two. Second, we used the importance levels from the votes on \emph{SDG in Order}~\cite{Goals} and calculated the weights using AHP, which reflect subjective factors, while the difficulty of each goal was calculated entirely from data, which reflects objective factors. Combining subjective and objective factors makes the decision results more realistic and more in line with people's needs.
  \item \Emph{Our model makes good use of information from multiple sources.} For example, the model uses text data to calculate the similarity between the goals, but text data can only reflect the strength of a relation, not its sign, whereas statistical data give the sign of the correlation. Multiplying the two yields a weight that carries both the strength and the sign of the relation and combines subjective and objective factors, so that each source compensates for the shortcomings of the other. As another example, the model calculates three priority indices but has no inherent way of deciding how much influence each should have, so AHP was used to determine their weights and a weighted average was taken. Using multiple sources of information compensates for the one-sidedness of any single source and produces more appropriate results.
  \item \Emph{Our model can be modified as the targets change.} With our model, it is only necessary to remove a goal that has been reached from the graph and add a new goal; the structure of the graph is then re-analyzed and combined with the other factors to give a new network structure and a new set of priorities. This improves the flexibility of the model to a certain extent: after one of the goals is achieved, or when a new urgent goal is found, the network can easily be modified and new decision results derived, thus meeting practical needs and orienting decisions towards more long-term sustainable development.
  \item \Emph{Our model is standardized for different types of statistics.} Firstly, we normalize the different influencing factors, so as to avoid the difference in order of magnitude from adversely affecting the calculation results; secondly, we transform all types of numerical indicators for each goal into temporal data, so as to unify the magnitudes of different indicators. After these treatments, the various influencing factors in the model can be treated equally, and the measurement indicators of different targets can be compared, thus reflecting the progress of the completion of various targets and allowing decision makers to grasp the current development more intuitively.
\end{itemize}

\subsection{Weaknesses}

\begin{itemize}
  \item \Emph{This model has certain requirements for the quality of the data.} While we use a combination of textual data and statistical data to build the network between SDGs, relevant and accurate data are often hard to obtain and evaluate, and additional work may be needed to optimize the constructed network and evaluate its reasonableness.
  \item \Emph{The present model may contain subjective factors that are difficult to quantify precisely.} SDG prioritization covers many aspects of the discussion, including but not limited to political, ethical, and other aspects, which cannot be ignored. We have attempted to quantify these discussions objectively, but opinions may vary from person to person, leading to different results.
\end{itemize}

\bibliography{ref}

\newpage

\begin{appendices}

  \section{Implementation Details of Sentence-BERT}
  \label{app:sbert}
  The Sentence-BERT model implementation and weights are from the website \emph{sBERT.net}~\cite{sbert}. The model has been trained on 215M question-answer pairs from various sources and domains, including StackExchange, Yahoo Answers, Google \& Bing search queries and many more.

  The model we use in this experiment is named ``multi-qa-MiniLM-L6-cos-v1''. It accepts input of length up to $512$ and transforms it into a vector of $384$ dimensions, which is normalized to unit length. This model is trained for semantic search tasks and is optimized for cosine similarity.

  \section{Data Cleaning and Processing}
  \label{app:data_cleaning}
  The indicators in the Sustainable Development Report~\cite{SustainableDevelopmentReport} are not exactly the same as those in the official UNSD SDGs API~\cite{UNSDSDGsAPI}, so we selected indicators that were rich in data and as close to the official ones as possible, and considered whether an indicator could be quantified in order to obtain the corresponding SDG difficulty. The data thus obtained were still very incomplete in terms of years and countries and had to be processed further. Firstly, we removed the countries with little data, which contribute little to the analysis. Secondly, we used the population of each country as the weight to calculate the weighted average of the data for each year; we chose population as the weight because most of the data relate to people. This gave us worldwide data for the years 2000 to 2021. Finally, we used SPSS to interpolate the missing data. The result serves as the original data for calculating the target difficulty index, the achievement prediction and so on.

  Calculating the correlation between every pair of indicators requires an extra data processing step, because the expected trends of the indicators are not the same. For example, we expect the participation rate in pre-primary organized learning to increase and the poverty rate to decrease. Progress on the two should therefore be positively correlated, but the raw data are negatively correlated. We therefore defined the progress of goals, which unifies the expected trends of all the indicators. Before defining the goal progress, we used the method in \cite{bidarbakhtnia2020measuring} to calculate the progress of each indicator:
  \begin{equation}
    P_d = \left(1 - \frac{|T_d-I_d|}{|T_d-I^*_{d}|} \right)\times 10
  \end{equation}
  in which $I_d$ is the value of indicator $d$ in 2021, $I^*_d$ is its value in 2000, and $T_d$ is its target value. The progress of goal $g$ is then defined as the mean progress over $IND_g$, the set of indicators belonging to goal $g$:
  \begin{equation}
    P_g = \frac{1}{|IND_g|}\sum_{d \in IND_g}P_d
  \end{equation}
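
  For instance, for a hypothetical indicator with $I^*_d=40$ in 2000, $I_d=25$ in 2021 and target $T_d=10$ (illustrative numbers only),
  \begin{equation*}
    P_d=\left(1-\frac{|10-25|}{|10-40|}\right)\times 10=\left(1-\frac{15}{30}\right)\times 10=5,
  \end{equation*}
  that is, half of the distance to the target has been covered. Because the formula uses absolute distances to the target, an indicator that should decrease and one that should increase are scored on the same scale.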


  % and finally calculated the weighted average of each indicator based on the population of each country in that year to obtain the following table.

  % \begin{table}[H]
  %   \caption{The average of some of the selected indicators, 2000-2021}
  %   \resizebox{\textwidth}{!}{%
  %     \begin{tabular}{cccccccccccccccc}
  %       \toprule
  %       1.1   & 2.1   & 2.2   & 3.1    & 3.2   & 4.1   & 4.2   & 5.1   & 5.2   & 6.1   & 6.2   & 7.1   & 8.2  & 9.1  & 10.1  & \ldots \\
  %       \midrule
  %       10.16 & 10.14 & 27.84 & 222.87 & 56.71 & 74.85 & 87.11 & 77.35 & 64.52 & 80.78 & 55.37 & 78.59 & 0.48 & 6.16 & 38.35 & \ldots \\
  %       9.67  & 13.12 & 24.58 & 217.49 & 54.51 & 73.37 & 88.34 & 80.89 & 64.53 & 81.14 & 56.25 & 78.07 & 0.47 & 5.87 & 41.32 & \ldots \\
  %       10.67 & 13.17 & 24.91 & 212.04 & 52.31 & 73.79 & 88.57 & 80.80 & 64.54 & 81.78 & 57.48 & 79.26 & 0.48 & 5.97 & 41.66 & \ldots \\
  %       10.16 & 13.12 & 24.78 & 206.64 & 50.17 & 77.20 & 90.71 & 80.58 & 64.67 & 82.29 & 58.63 & 79.87 & 0.48 & 6.32 & 39.99 & \ldots \\
  %       9.97  & 12.83 & 24.91 & 200.33 & 48.38 & 76.71 & 92.44 & 80.46 & 64.70 & 82.78 & 59.79 & 80.02 & 0.48 & 6.44 & 40.30 & \ldots \\
  %       9.56  & 12.33 & 26.25 & 192.25 & 46.14 & 73.20 & 92.97 & 76.28 & 64.86 & 84.06 & 61.09 & 80.25 & 0.48 & 6.58 & 40.47 & \ldots \\
  %       10.47 & 11.55 & 24.86 & 184.94 & 44.25 & 73.82 & 92.87 & 81.04 & 64.64 & 84.52 & 62.28 & 81.33 & 0.48 & 6.66 & 39.94 & \ldots \\
  %       9.91  & 10.79 & 24.18 & 178.21 & 42.45 & 76.44 & 95.03 & 80.32 & 64.52 & 84.97 & 63.47 & 82.28 & 0.48 & 6.64 & 39.35 & \ldots \\
  %       10.04 & 10.28 & 25.03 & 172.86 & 40.93 & 79.62 & 95.46 & 81.63 & 64.22 & 85.41 & 64.67 & 82.32 & 0.48 & 6.83 & 39.96 & \ldots \\
  %       10.50 & 9.79  & 24.29 & 167.40 & 39.07 & 77.91 & 95.02 & 79.69 & 64.17 & 85.84 & 65.86 & 82.79 & 0.48 & 7.17 & 38.93 & \ldots \\
  %       14.99 & 9.52  & 24.75 & 161.90 & 37.68 & 81.53 & 95.35 & 79.03 & 63.91 & 86.27 & 67.03 & 83.16 & 0.48 & 7.21 & 37.82 & \ldots \\
  %       13.67 & 9.28  & 24.47 & 156.81 & 36.05 & 80.21 & 95.55 & 79.73 & 63.67 & 86.69 & 68.19 & 81.95 & 0.48 & 7.37 & 39.40 & \ldots \\
  %       12.52 & 9.14  & 24.56 & 152.58 & 34.71 & 79.62 & 95.99 & 80.74 & 63.52 & 87.10 & 69.36 & 84.51 & 0.48 & 7.43 & 40.10 & \ldots \\
  %       10.88 & 8.95  & 25.31 & 148.81 & 33.41 & 84.03 & 96.21 & 81.41 & 63.56 & 87.51 & 70.52 & 84.80 & 0.48 & 7.60 & 39.53 & \ldots \\
  %       10.22 & 8.79  & 24.66 & 145.60 & 32.20 & 85.03 & 95.35 & 82.28 & 63.61 & 87.92 & 71.68 & 85.24 & 0.49 & 7.72 & 39.93 & \ldots \\
  %       9.51  & 8.69  & 23.35 & 142.24 & 31.03 & 86.06 & 96.03 & 82.03 & 63.74 & 88.32 & 72.84 & 86.38 & 0.46 & 7.80 & 39.18 & \ldots \\
  %       9.06  & 8.65  & 24.95 & 138.61 & 29.96 & 80.56 & 96.13 & 82.19 & 63.84 & 88.72 & 73.98 & 87.43 & 0.47 & 7.81 & 39.56 & \ldots \\
  %       8.67  & 8.63  & 22.98 & 136.28 & 28.87 & 83.24 & 96.42 & 82.26 & 63.98 & 89.06 & 75.00 & 88.33 & 0.47 & 8.07 & 39.35 & \ldots \\
  %       8.12  & 8.69  & 25.04 & 174.09 & 27.87 & 85.02 & 96.70 & 82.27 & 64.01 & 89.43 & 76.11 & 89.16 & 0.49 & 8.38 & 40.17 & \ldots \\
  %       7.71  & 9.24  & 22.64 & 171.68 & 26.97 & 83.77 & 96.83 & 83.94 & 64.35 & 89.78 & 77.09 & 89.84 & 0.49 & 8.18 & 42.63 & \ldots \\
  %       8.68  & 10.00 & 24.52 & 172.78 & 26.12 & 70.63 & 93.83 & 80.88 & 63.32 & 90.11 & 78.12 & 83.61 & 0.49 & 7.98 & 39.52 & \ldots \\
  %       8.24  & 10.01 & 25.05 & 177.07 & 37.69 & 55.39 & 88.99 & 80.69 & 63.98 & 86.24 & 66.77 & 82.53 & 0.47 & 7.03 & 39.69 & \ldots \\
  %       \bottomrule
  %     \end{tabular}%
  %   }
  % \end{table}


\end{appendices}
\end{document}
